Tags: machine learning* + llm*

  1. This article describes a workflow that uses Large Language Models (LLMs) to automate normalising spreadsheet data, making it tidy and machine-readable for easier analysis (a minimal prompting sketch follows the list below).
  2. A comprehensive guide to ultrascale machine learning, covering techniques, tools, and best practices.
  3. The attention mechanism in Large Language Models (LLMs) derives the meaning of a word from its context. Words are encoded as multi-dimensional vectors; query and key vectors are computed for each token; and attention weights adjust each embedding according to contextual relevance (see the NumPy sketch after this list).
  4. Qodo-Embed-1-1.5B is a state-of-the-art code embedding model designed for retrieval tasks in the software development domain. It supports multiple programming languages and is optimized for natural-language-to-code and code-to-code retrieval, making it highly effective for applications such as code search and retrieval-augmented generation (a retrieval sketch follows the list below).
  5. Qodo releases Qodo-Embed-1-1.5B, an open-source code embedding model that outperforms competitors from OpenAI and Salesforce, enhancing code search, retrieval, and understanding for enterprise development teams.
  6. SmolVLM2 marks a shift in video understanding technology by introducing efficient models that can run on devices from phones to servers. The release includes models in three sizes (2.2B, 500M, and 256M) with Python and Swift API support. The models offer video understanding with reduced memory consumption and ship with a suite of demo applications for practical use.
  7. The article delves into how large language models (LLMs) store facts, focusing on the role of multi-layer perceptrons (MLPs) in this process. It explains the mechanics of MLPs, including matrix multiplication, bias addition, and the Rectified Linear Unit (ReLU) function, using the example of encoding the fact that Michael Jordan plays basketball (a toy MLP sketch follows the list below). The article also discusses superposition, which lets models store a vast number of features by using nearly perpendicular directions in high-dimensional space.
  8. Sawmills AI has introduced a smart telemetry data management platform aimed at reducing costs and improving data quality for enterprise observability. By acting as a middleware layer that uses AI and ML to optimize telemetry data before it reaches vendors like Datadog and Splunk, Sawmills helps companies manage data efficiently, retain data sovereignty, and reduce unnecessary data processing costs.
  9. The article explores the architectural changes that enable DeepSeek's models to perform well with fewer resources, focusing on Multi-Head Latent Attention (MLA). It traces the evolution of attention mechanisms, from Bahdanau to the Transformer's Multi-Head Attention (MHA), and introduces Grouped-Query Attention (GQA) as a remedy for MHA's memory inefficiency (a GQA sketch follows the list below). The article highlights DeepSeek's competitive performance despite lower reported training costs.
  10. A comprehensive guide to Large Language Models by Damien Benveniste, with chapters spanning pre-transformer language models through deploying LLMs:

    - Language Models Before Transformers
    - Attention Is All You Need: The Original Transformer Architecture
    - A More Modern Approach To The Transformer Architecture
    - Multi-modal Large Language Models
    - Transformers Beyond Language Models
    - Non-Transformer Language Models
    - How LLMs Generate Text
    - From Words To Tokens
    - Training LLMs to Follow Instructions
    - Scaling Model Training
    - Fine-Tuning LLMs
    - Deploying LLMs
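
For item 1, a minimal sketch of the kind of tidying step such a workflow could use, assuming the openai Python client and an OpenAI-style chat model; the model name, prompt wording, and sample data are illustrative assumptions, not taken from the article:

```python
# Sketch of an LLM-based spreadsheet-tidying step. Assumes the openai
# Python client (>= 1.0); model name and prompt are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

messy_csv = """Region,Q1 Sales,Q2 Sales
North,100,120
South,90,95
"""

prompt = (
    "Convert this wide-format CSV to tidy long format with columns "
    "region,quarter,sales. Return only CSV, no commentary.\n\n" + messy_csv
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; any capable chat model works
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # tidy, machine-readable CSV
```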
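
For item 3, a toy NumPy sketch of single-head scaled dot-product attention: each token's query is scored against every key, and the softmaxed weights mix the value vectors into a context-adjusted embedding. The dimensions and random projection matrices are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_tokens, d_model = 4, 8                    # 4 words, 8-dim embeddings
X = rng.normal(size=(n_tokens, d_model))    # token embeddings

W_q = rng.normal(size=(d_model, d_model))   # learned projections (random here)
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = X @ W_q, X @ W_k, X @ W_v
scores = Q @ K.T / np.sqrt(d_model)         # query-key similarity
weights = softmax(scores)                   # attention weights, rows sum to 1
context = weights @ V                       # context-adjusted embeddings
print(weights.round(2))
```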
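
For items 4 and 5, a sketch of natural-language-to-code retrieval with an embedding model. It assumes Qodo-Embed-1-1.5B loads through the sentence-transformers library, as is typical for embedding models on the Hugging Face Hub, and that the repository ID matches the model name; check the model card for the exact usage:

```python
from sentence_transformers import SentenceTransformer, util

# Model ID assumed from the name in items 4-5; verify on the Hub.
model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")

code_snippets = [
    "def add(a, b):\n    return a + b",
    "def read_file(path):\n    with open(path) as f:\n        return f.read()",
]
query = "function that sums two numbers"

code_emb = model.encode(code_snippets, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

scores = util.cos_sim(query_emb, code_emb)[0]  # cosine similarity per snippet
best = int(scores.argmax())
print(code_snippets[best], float(scores[best]))
```

Cosine similarity over a shared embedding space is what makes natural-language-to-code and code-to-code retrieval the same lookup.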
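
For item 7, a toy NumPy version of the MLP block that article describes: up-projection, bias, ReLU, down-projection, and a residual add. Shapes and random weights are illustrative; in a trained model, particular directions would encode features such as "Michael Jordan" and "plays basketball":

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)               # zeroes out inactive features

rng = np.random.default_rng(1)
d_model, d_hidden = 8, 32                   # transformer MLPs typically expand ~4x

W_up = rng.normal(size=(d_model, d_hidden))
b_up = np.zeros(d_hidden)
W_down = rng.normal(size=(d_hidden, d_model))
b_down = np.zeros(d_model)

x = rng.normal(size=d_model)                # e.g. an embedding carrying "Michael Jordan"
h = relu(x @ W_up + b_up)                   # rows of W_up act as feature detectors
y = x + (h @ W_down + b_down)               # residual add writes new info back
print(y.shape)
```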
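
For item 9, a toy sketch of Grouped-Query Attention (GQA): several query heads share each key/value head, so the KV cache holds far fewer K and V tensors than full multi-head attention would (MLA's latent compression is not shown). Sizes are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(2)
n_tokens, d_head = 4, 8
n_q_heads, n_kv_heads = 8, 2                # 4 query heads share each KV head

Q = rng.normal(size=(n_q_heads, n_tokens, d_head))
K = rng.normal(size=(n_kv_heads, n_tokens, d_head))  # cached per KV head only
V = rng.normal(size=(n_kv_heads, n_tokens, d_head))

group = n_q_heads // n_kv_heads
outputs = []
for h in range(n_q_heads):
    kv = h // group                         # which shared KV head this query uses
    w = softmax(Q[h] @ K[kv].T / np.sqrt(d_head))
    outputs.append(w @ V[kv])
out = np.concatenate(outputs, axis=-1)      # (n_tokens, n_q_heads * d_head)
print(out.shape)                            # KV cache is 4x smaller than MHA's
```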
